# Retrieval-augmented generation

**all-MiniLM-L2-v2** · tabularisai · Apache-2.0
Tags: Text Embedding, Multilingual · Downloads: 5,063 · Likes: 2
Distilled from all-MiniLM-L12-v2, this model achieves nearly 2x faster inference while maintaining high accuracy on both CPU and GPU.
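The embedding models in this list all serve the same retrieval step: encode query and documents into vectors, then rank documents by cosine similarity. A minimal, stdlib-only sketch of that ranking, using toy hand-written vectors as stand-ins for real model embeddings (in practice the vectors would come from a model such as the one above):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings": stand-ins for vectors an embedding model would produce.
corpus = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_cars": [0.0, 0.2, 0.9],
}
query_vec = [0.85, 0.2, 0.05]

# Rank documents by similarity to the query vector, best first.
ranked = sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]), reverse=True)
print(ranked)  # → ['doc_cats', 'doc_dogs', 'doc_cars']
```

Real pipelines replace the hand-written vectors with model outputs and usually pre-normalize them so ranking reduces to a dot product.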
**ReasonIR-8B** · reasonir
Tags: Text Embedding, Transformers, English · Downloads: 13.43k · Likes: 39
ReasonIR-8B is the first retrieval model trained specifically for general reasoning tasks. It achieves state-of-the-art retrieval performance on the BRIGHT benchmark and substantially improves MMLU and GPQA results in RAG applications.
**SmolLM2-135M-Bebop-Reranker (GGUF)** · RichardErkhov
Downloads: 855 · Likes: 0
A lightweight text-ranking model suitable for reordering search results or retrieved documents.
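Rerankers like the one above sit behind a fast first-stage retriever: the retriever returns candidate documents, and the reranker rescores each (query, document) pair with a stronger model. A sketch of that two-stage flow, where `overlap_score` is a toy stand-in for a real reranker's score (a real pipeline would call the model here):

```python
def rerank(query, candidates, score_fn):
    # Rescore first-stage candidates with a (query, doc) scoring function
    # and return them best-first. In practice score_fn would invoke a
    # reranker model rather than this toy heuristic.
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    return [doc for _, doc in sorted(scored, key=lambda t: t[0], reverse=True)]

def overlap_score(query, doc):
    # Toy stand-in for a model score: fraction of query terms found in doc.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

docs = ["how to bake bread", "bread pricing report", "how to bake a cake"]
ranked = rerank("how to bake bread", docs, overlap_score)
print(ranked[0])  # → how to bake bread
```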
**gte-Qwen2-7B-Instruct-GGUF** · mradermacher · Apache-2.0
Tags: Large Language Model, English · Downloads: 510 · Likes: 2
A 7B-parameter multilingual text-embedding model from the Alibaba NLP team, specialized for sentence-similarity tasks and offered in multiple quantized versions.
**PLLuM-12B-nc-chat** · CYFRAGOVPL
Tags: Large Language Model, Transformers · Downloads: 2,673 · Likes: 6
PLLuM-12B-nc-chat is the dialogue-optimized, 12-billion-parameter member of the PLLuM (Polish Large Language Model) family. It is tailored to Polish and other Slavic and Baltic languages, using instruction fine-tuning and preference learning to provide safe, efficient interaction.
**Jina-Embeddings-GGUF** · narainp · Apache-2.0
Tags: Text Embedding, English · Downloads: 139 · Likes: 1
Jina Embeddings V2 Base is an efficient English sentence-embedding model focused on sentence similarity and feature extraction.
**Granite-3.1-3B-A800M-Instruct** · ibm-granite · Apache-2.0
Tags: Large Language Model, Transformers · Downloads: 36.16k · Likes: 24
A 3-billion-parameter long-context instruction model fine-tuned from Granite-3.1-3B-A800M-Base, with support for multilingual tasks.
**Command R** · cortexso
Tags: Large Language Model · Downloads: 748 · Likes: 2
C4AI Command R is a research release of a high-performance 35-billion-parameter generative model, optimized for use cases such as reasoning, summarization, and question answering.
**gte-Qwen2-7B-Instruct** · Alibaba-NLP · Apache-2.0
Tags: Large Language Model, Transformers · Downloads: 169.82k · Likes: 398
A 7B-parameter model based on the Qwen2 architecture, focused on sentence-similarity computation and text-embedding tasks.
**Llama-3-Typhoon-v1.5x-8B-Instruct** · scb10x
Tags: Large Language Model, Transformers, Multilingual · Downloads: 3,269 · Likes: 16
An 8-billion-parameter instruction model built for Thai, with performance comparable to GPT-3.5-Turbo. It is optimized for application use cases such as retrieval-augmented generation, constrained generation, and reasoning tasks.
**snowflake-arctic-embed-m-long** · Snowflake · Apache-2.0
Tags: Text Embedding, Transformers · Downloads: 23.79k · Likes: 38
A sentence-transformers-based sentence-embedding model focused on sentence similarity and feature extraction.
**selfrag_llama2_7b** · selfrag · MIT
Tags: Large Language Model, Transformers · Downloads: 1,318 · Likes: 78
A 7-billion-parameter Self-RAG model that answers diverse user queries while adaptively invoking a retrieval system, and critiques its own outputs and the retrieved passages by generating reflection tokens.
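The Self-RAG description above amounts to a control loop: decide whether to retrieve at all, generate a candidate answer per retrieved passage, and keep the answer the critique step scores highest. A sketch of that loop with hypothetical callables standing in for model calls (in the actual model, the retrieval decision and critiques come from reflection tokens the LM itself generates, not separate functions):

```python
def self_rag_answer(query, needs_retrieval, retrieve, generate, critique):
    # Sketch of the Self-RAG control flow; every callable is a hypothetical
    # stand-in for a model call or reflection-token decision.
    if not needs_retrieval(query):          # "retrieve?" decision
        return generate(query, None)
    best, best_score = None, float("-inf")
    for passage in retrieve(query):
        answer = generate(query, passage)
        score = critique(query, passage, answer)  # self-critique score
        if score > best_score:
            best, best_score = answer, score
    return best

# Toy stand-ins to exercise the control flow.
answer = self_rag_answer(
    "What is the capital of France?",
    needs_retrieval=lambda q: "capital" in q,
    retrieve=lambda q: ["Berlin is the capital of Germany.",
                        "Paris is the capital of France."],
    generate=lambda q, passage: passage or "I don't know.",
    critique=lambda q, p, a: 1.0 if "France" in a else 0.0,
)
print(answer)  # → Paris is the capital of France.
```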
**SGPT-125M-weightedmean-msmarco-specb-bitfit** · Muennighoff
Tags: Text Embedding · Downloads: 4,086 · Likes: 2
SGPT-125M is a sentence-embedding model using weighted-mean pooling and BitFit fine-tuning, focused on sentence-similarity tasks.
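The models above cover the stages of a RAG pipeline: an embedding model retrieves passages, an optional reranker reorders them, and a generator answers from a prompt that includes the retrieved context. A minimal, model-free sketch of the final prompt-assembly step (the template is illustrative only, not a format required by any model listed here):

```python
def build_rag_prompt(query, passages):
    # Assemble a grounded prompt: numbered retrieved passages as context,
    # followed by the user's question.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_rag_prompt(
    "How many parameters does the model have?",
    ["The instruct model has 8 billion parameters.",
     "It was trained for Thai and English."],
)
print(prompt)
```

The generator's reply can then cite passages by their bracketed numbers, which keeps answers attributable to the retrieved sources.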
© 2025 AIbase